14 research outputs found

Introspective knowledge acquisition for case retrieval networks in textual case base reasoning.

Author: Chakraborti Sutanu
Publication date: 31/08/2007

Textual Case Based Reasoning (TCBR) aims at effective reuse of information contained in unstructured documents. The key advantage of TCBR over traditional Information Retrieval systems is its ability to incorporate domain-specific knowledge to facilitate case comparison beyond simple keyword matching. However, substantial human intervention is needed to acquire and transform this knowledge into a form suitable for a TCBR system. In this research, we present automated approaches that exploit statistical properties of document collections to alleviate this knowledge acquisition bottleneck. We focus on two important knowledge containers: relevance knowledge, which shows relatedness of features to cases, and similarity knowledge, which captures the relatedness of features to each other. The terminology is derived from the Case Retrieval Network (CRN) retrieval architecture in TCBR, which is used as the underlying formalism in this thesis applied to text classification. Latent Semantic Indexing (LSI) generated concepts are a useful resource for relevance knowledge acquisition for CRNs. This thesis introduces a supervised LSI technique called sprinkling that exploits class knowledge to bias LSI's concept generation. An extension of this idea, called Adaptive Sprinkling (AS), has been proposed to handle inter-class relationships in complex domains like hierarchical (e.g. Yahoo directory) and ordinal (e.g. product ranking) classification tasks. Experimental evaluation results show the superiority of CRNs created with sprinkling and AS, not only over LSI on its own, but also over state-of-the-art classifiers like Support Vector Machines (SVM). Current statistical approaches based on feature co-occurrences can be utilized to mine similarity knowledge for CRNs. However, related words often do not co-occur in the same document, though they co-occur with similar words. We introduce an algorithm to efficiently mine such indirect associations, called higher order associations.
Empirical results show that CRNs created with the acquired similarity knowledge outperform both LSI and SVM. Incorporating acquired knowledge into the CRN transforms it into a densely connected network. While improving retrieval effectiveness, this has the unintended effect of slowing down retrieval. We propose a novel retrieval formalism called the Fast Case Retrieval Network (FCRN) which eliminates redundant run-time computations to improve retrieval speed. Experimental results show FCRN's ability to scale up over high-dimensional textual casebases. Finally, we investigate novel ways of visualizing and estimating complexity of textual casebases that can help explain performance differences across casebases. Visualization provides a qualitative insight into the casebase, while complexity is a quantitative measure that characterizes classification or retrieval hardness intrinsic to a dataset. We study correlations of experimental results from the proposed approaches against complexity measures over diverse casebases.

Open Access Institutional Repository at Robert Gordon University

Conceptualizing Curse of Dimensionality with Parallel Coordinates

Authors: Chakraborti Sutanu; Chauhan Charu; Devi G.
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 05/03/2016

We report on a novel use of parallel coordinates as a pedagogical tool for illustrating the non-intuitive properties of high-dimensional spaces, with special emphasis on the Curse of Dimensionality. We have also collated what we believe to be a representative sample of the diverse approaches that exist in the literature to conceptualize the Curse of Dimensionality. We envisage that the paper will have pedagogical value in structuring the way the Curse of Dimensionality is presented in classrooms and associated lab sessions.

Association for the Advancement of Artificial Intelligence: AAAI Publications

More or Better: On Trade-offs in Compacting Textual Problem Solution Repositories

Authors: Chakraborti Sutanu; Khemani Deepak; Padmanabhan Deepak
Publication date: 01/01/2011

Queen's University Belfast Research Portal

Never judge a Case by its (unreliable) neighbors: Estimating Case Reliability for CBR

Authors: Chakraborti Sutanu; P. Deepak; Parsodkar Adwait P.
Publication venue: Springer Science and Business Media LLC
Publication date: 14/08/2023

Queen's University Belfast Research Portal

Revisiting Fast and Slow Thinking in Case-Based Reasoning

Authors: Chakraborti Sutanu; Ganesan Devi; Kaurav Srashti; P Deepak
Publication venue: Springer Science and Business Media LLC
Publication date: 21/09/2021

Queen's University Belfast Research Portal

Thinking Fast and Slow: A CBR Perspective

Authors: Chakraborti Sutanu; Ganesan Devi; Kaurav Srashti; Padmanabhan Deepak
Publication venue: University of Florida George A Smathers Libraries
Publication date: 18/04/2021

Queen's University Belfast Research Portal

Counterfactuals as Explanations for Monotonic Classifiers

Authors: Chakraborti Sutanu; K Sarathi; Mitra Shania; P Deepak
Publication date: 23/07/2022

Queen's University Belfast Research Portal

Towards Richer Realizations of Holographic CBR

Authors: Chakraborti Sutanu; Ganesan Devi; Padmanabhan Deepak; Subramanian Renganathan
Publication venue: Springer Science and Business Media LLC
Publication date: 21/09/2021

Queen's University Belfast Research Portal

Propositional approach to textual case indexing

Authors: Ivan Koychev; Nirmalie Wiratunga; Rob Lothian; Sutanu Chakraborti

Problem solving with experiences that are recorded in text form requires a mapping from text to structured cases, so that case comparison can provide informed feedback for reasoning. One of the challenges is to acquire an indexing vocabulary to describe cases. We explore the use of machine learning and statistical techniques to automate aspects of this acquisition task. A propositional semantic indexing tool, PSI, which forms its indexing vocabulary from new features extracted as logical combinations of existing keywords, is presented. We propose that such logical combinations correspond more closely to natural concepts and are more transparent than linear combinations. Experiments show PSI-derived case representations to have superior retrieval performance to the original keyword-based representations. PSI also has comparable performance to Latent Semantic Indexing, a popular dimensionality reduction technique for text, which, unlike PSI, generates linear combinations of the original features.

CiteSeerX